More strictly speaking, the confidence level represents the frequency (i.e. the proportion) of possible confidence intervals that contain the true value of the unknown population parameter. In other words, if confidence intervals are constructed using a given confidence level from an infinite number of independent sample statistics, the proportion of those intervals that contain the true value of the parameter will be equal to the confidence level.
- Wikipedia
Create a function to compute a confidence interval assuming the population is normally distributed. By default, sd = NA and confidence_level = 0.95. The function will take a numeric vector x and produce a 95% confidence interval using a t-table value (since the population standard deviation is missing) with degrees of freedom equal to the length of x - 1.
When the population standard deviation is known and supplied, the calculation will be based on a standard normal table value.
confidence_interval <- function(x, sd = NA, confidence_level = 0.95) {
alpha <- 1 - confidence_level
if (!is.na(sd)) {
table_value <- qnorm(alpha/2, lower.tail = FALSE)
} else {
table_value <- qt(alpha/2, df = length(x) - 1, lower.tail = FALSE)
sd <- sd(x)
}
ci <- mean(x) + c(-1, 1) * table_value * sd / sqrt(length(x))
names(ci) <- c("Lower Bound", "Upper Bound")
ci
}
Now, let’s calculate some confidence intervals on known distributions.
n <- 5
true_mean <- 0
# Define the population - drawing from a standard normal distribution
data <- rnorm(n, true_mean, 1)
data
[1] -0.1164064 -0.5215424 -0.2472142 -1.2242537 -0.9047450
interval <- confidence_interval(data)
interval
Lower Bound Upper Bound
-1.17419961 -0.03146508
This is our confidence interval for the mean of the simulated data we just created. We can check to see if this interval includes the true_mean.
interval[1] < true_mean && true_mean < interval[2]
[1] FALSE
Now, let’s run it again!
data <- rnorm(n, true_mean, 1)
data
[1] 1.90885536 0.40135318 -0.16097588 0.41621101 -0.07489825
interval <- confidence_interval(data)
interval
Lower Bound Upper Bound
-0.5349756 1.5311938
Why are these confidence intervals different?
Simulation
Now we’ll simulate thousands of confidence intervals!
n_reps <- 10000
For Loop
contains <- numeric(n_reps)
intervals <- matrix(NA, nrow = n_reps, ncol = 2)
for (i in 1:n_reps) {
data <- rnorm(n, true_mean, 1)
interval <- confidence_interval(data)
intervals[i,] <- interval
contains[i] <- interval[1] < true_mean && true_mean < interval[2]
}
coverage <- mean(contains)
coverage
[1] 0.9497
Replicate
intervals <- replicate(n_reps, {
data <- rnorm(n, true_mean, 1)
confidence_interval(data)
})
contains <- apply(intervals, 2, prod) < 0
coverage <- mean(contains)
coverage
[1] 0.9452
Tidyverse
library(tidyverse)
[30m── [1mAttaching packages[22m ────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──[39m
[30m[32m✔[30m [34mggplot2[30m 3.1.0 [32m✔[30m [34mpurrr [30m 0.2.5
[32m✔[30m [34mtibble [30m 2.0.0 [32m✔[30m [34mdplyr [30m 0.7.8
[32m✔[30m [34mtidyr [30m 0.8.2 [32m✔[30m [34mstringr[30m 1.3.1
[32m✔[30m [34mreadr [30m 1.3.1 [32m✔[30m [34mforcats[30m 0.3.0[39m
package ‘tibble’ was built under R version 3.5.2[30m── [1mConflicts[22m ───────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
[31m✖[30m [34mdplyr[30m::[32mfilter()[30m masks [34mstats[30m::filter()
[31m✖[30m [34mdplyr[30m::[32mlag()[30m masks [34mstats[30m::lag()[39m
intervals <- rerun(n_reps, confidence_interval(rnorm(n, true_mean)))
contains <- map_lgl(intervals, ~prod(.) < 0)
coverage <- mean(contains)
coverage
[1] 0.9493
Look at that! Close to .95 - illustrating exactly what the definition of a confidence interval refers to. Now, since we arrived at this calculation via a simulation, we can put a confidence interval on our coverage estimate for confidence intervals. Now, note that in this case we’re creating a confidence interval for a proportion:
\[
p \pm Z_\alpha \sqrt{\frac{p(1-p)}{N}}
\]
coverage + c(-1, 1) * qnorm(0.975) * sqrt(coverage * (1 - coverage)/n_reps)
[1] 0.9450001 0.9535999
Now let’s get some visual intuition behind this:
intervals_df <- map_df(intervals, ~tibble(lower = .["Lower Bound"], upper = .["Upper Bound"])) %>%
mutate(index = 1:n(),
includes = lower <= true_mean & upper >= true_mean)
ggplot(head(intervals_df, 100), aes(x = lower, xend = upper, y = index, yend = index)) +
geom_segment(lineend = "butt", aes(col = includes), show.legend = FALSE, lwd = 2) +
geom_vline(xintercept = true_mean, col = "red", lwd = 1.5) +
theme_bw() +
labs(title = "Confidence Interval Coverage",
x = "",
y = "")

Expand the simulation
simulate <- function(ci_function, truth = 0, n_reps = 10000) {
# ci_function returns a confidence interval for a given set of data
cov <- apply(replicate(n_reps, ci_function() - truth), 2, prod) < 0
coverage <- mean(cov)
# Confidence interval for proportion
coverage + c(-1, 1) * qnorm(0.975) * sqrt(coverage * (1 - coverage)/n_reps)
}
simulate(function() confidence_interval(rnorm(5), sd = 1))
[1] 0.9424012 0.9511988
simulate(function() confidence_interval(rnorm(5, sd = 1)))
[1] 0.9491646 0.9574354
simulate(function() confidence_interval(rnorm(5, sd = 1), sd = 1.5))
[1] 0.9955759 0.9978241
sapply(c(5, 10, 50, 100, 1000), function(n_observations) {
simulate(function() confidence_interval(rnorm(n_observations)))
})
[,1] [,2] [,3] [,4] [,5]
[1,] 0.9427129 0.9448961 0.9496857 0.9457284 0.9457284
[2,] 0.9514871 0.9535039 0.9579143 0.9542716 0.9542716
LS0tCnRpdGxlOiAiQ29uZmlkZW5jZSBJbnRlcnZhbHMiCm91dHB1dDogaHRtbF9ub3RlYm9vawotLS0KCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQprbml0cjo6b3B0c19jaHVuayRzZXQoY2FjaGUgPSBUUlVFKQpgYGAKCj4gTW9yZSBzdHJpY3RseSBzcGVha2luZywgdGhlIGNvbmZpZGVuY2UgbGV2ZWwgcmVwcmVzZW50cyB0aGUgZnJlcXVlbmN5IChpLmUuIHRoZSAKcHJvcG9ydGlvbikgb2YgcG9zc2libGUgY29uZmlkZW5jZSBpbnRlcnZhbHMgdGhhdCBjb250YWluIHRoZSB0cnVlIHZhbHVlIG9mIHRoZSAKdW5rbm93biBwb3B1bGF0aW9uIHBhcmFtZXRlci4gSW4gb3RoZXIgd29yZHMsIGlmIGNvbmZpZGVuY2UgaW50ZXJ2YWxzIGFyZSBjb25zdHJ1Y3RlZAp1c2luZyBhIGdpdmVuIGNvbmZpZGVuY2UgbGV2ZWwgZnJvbSBhbiBpbmZpbml0ZSBudW1iZXIgb2YgaW5kZXBlbmRlbnQgc2FtcGxlIApzdGF0aXN0aWNzLCB0aGUgcHJvcG9ydGlvbiBvZiB0aG9zZSBpbnRlcnZhbHMgdGhhdCBjb250YWluIHRoZSB0cnVlIHZhbHVlIG9mIHRoZSAKcGFyYW1ldGVyIHdpbGwgYmUgZXF1YWwgdG8gdGhlIGNvbmZpZGVuY2UgbGV2ZWwuCgpcLSBbV2lraXBlZGlhXShodHRwczovL2VuLndpa2lwZWRpYS5vcmcvd2lraS9Db25maWRlbmNlX2ludGVydmFsKQoKLS0tCgpDcmVhdGUgYSBmdW5jdGlvbiB0byBjb21wdXRlIGEgY29uZmlkZW5jZSBpbnRlcnZhbCBhc3N1bWluZyB0aGUgcG9wdWxhdGlvbiBpcwpub3JtYWxseSBkaXN0cmlidXRlZC4gQnkgZGVmYXVsdCwgYHNkID0gTkFgIGFuZCBgY29uZmlkZW5jZV9sZXZlbCA9IDAuOTVgLiBUaGUKZnVuY3Rpb24gd2lsbCB0YWtlIGEgbnVtZXJpYyB2ZWN0b3IgYHhgIGFuZCBwcm9kdWNlIGEgOTUlIGNvbmZpZGVuY2UgaW50ZXJ2YWwKdXNpbmcgYSB0LXRhYmxlIHZhbHVlIChzaW5jZSB0aGUgcG9wdWxhdGlvbiBzdGFuZGFyZCBkZXZpYXRpb24gaXMgbWlzc2luZykgd2l0aApkZWdyZWVzIG9mIGZyZWVkb20gZXF1YWwgdG8gdGhlIGxlbmd0aCBvZiBgeGAgLSAxLgoKV2hlbiB0aGUgcG9wdWxhdGlvbiBzdGFuZGFyZCBkZXZpYXRpb24gaXMga25vd24gYW5kIHN1cHBsaWVkLCB0aGUgY2FsY3VsYXRpb24gd2lsbApiZSBiYXNlZCBvbiBhIHN0YW5kYXJkIG5vcm1hbCB0YWJsZSB2YWx1ZS4KCmBgYHtyfQpjb25maWRlbmNlX2ludGVydmFsIDwtIGZ1bmN0aW9uKHgsIHNkID0gTkEsIGNvbmZpZGVuY2VfbGV2ZWwgPSAwLjk1KSB7CiAgYWxwaGEgPC0gMSAtIGNvbmZpZGVuY2VfbGV2ZWwKICBpZiAoIWlzLm5hKHNkKSkgewogICAgdGFibGVfdmFsdWUgPC0gcW5vcm0oYWxwaGEvMiwgbG93ZXIudGFpbCA9IEZBTFNFKQogIH0gZWxzZSB7CiAgICB0YWJsZV92YWx1ZSA8LSBxdChhbHBoYS8yLCBkZiA9IGxlbmd0aCh4KSAtIDEsIGxvd2VyLnRhaWwgPSBGQUxTRSkKICAgIHNkIDwtIHNkKHgpCiAgfQogIGNpIDwtIG1lYW4oeCkgKyBjKC0xLCAxKSAqIHRhYmxlX3ZhbHVlICogc2QgLyBzcXJ0KGxlbmd0aCh4KSkKICBuYW1lcyhjaSkgPC0gYygiTG93ZXIgQm91bmQiLCAiVXBwZXIgQm91bmQiKQogIGNpCn0KYGBgCgpOb3csIGxldCdzIGNhbGN1bGF0ZSBzb21lIGNvbmZpZGVuY2UgaW50ZXJ2YWxzIG9uIGtub3duIGRpc3RyaWJ1dGlvbnMuCgpgYGB7cn0KbiA8LSA1CnRydWVfbWVhbiA8LSAwCgojIERlZmluZSB0aGUgcG9wdWxhdGlvbiAtIGRyYXdpbmcgZnJvbSBhIHN0YW5kYXJkIG5vcm1hbCBkaXN0cmlidXRpb24KZGF0YSA8LSBybm9ybShuLCB0cnVlX21lYW4sIDEpCmRhdGEKYGBgCgpgYGB7cn0KaW50ZXJ2YWwgPC0gY29uZmlkZW5jZV9pbnRlcnZhbChkYXRhKQppbnRlcnZhbApgYGAKClRoaXMgaXMgb3VyIGNvbmZpZGVuY2UgaW50ZXJ2YWwgZm9yIHRoZSBtZWFuIG9mIHRoZSBzaW11bGF0ZWQgZGF0YSB3ZSBqdXN0IGNyZWF0ZWQuIFdlIGNhbiBjaGVjayB0byBzZWUgaWYgdGhpcyBpbnRlcnZhbCBpbmNsdWRlcyB0aGUgYHRydWVfbWVhbmAuCgpgYGB7cn0KaW50ZXJ2YWxbMV0gPCB0cnVlX21lYW4gJiYgdHJ1ZV9tZWFuIDwgaW50ZXJ2YWxbMl0KYGBgCgpOb3csIGxldCdzIHJ1biBpdCBhZ2FpbiEKCmBgYHtyfQpkYXRhIDwtIHJub3JtKG4sIHRydWVfbWVhbiwgMSkKZGF0YQoKaW50ZXJ2YWwgPC0gY29uZmlkZW5jZV9pbnRlcnZhbChkYXRhKQppbnRlcnZhbApgYGAKCldoeSBhcmUgdGhlc2UgY29uZmlkZW5jZSBpbnRlcnZhbHMgZGlmZmVyZW50PwoKIyMjIFNpbXVsYXRpb24KTm93IHdlJ2xsIHNpbXVsYXRlIHRob3VzYW5kcyBvZiBjb25maWRlbmNlIGludGVydmFscyEKCmBgYHtyfQpuX3JlcHMgPC0gMTAwMDAKYGBgCgoKIyMjIyBGb3IgTG9vcApgYGB7cn0KY29udGFpbnMgPC0gbnVtZXJpYyhuX3JlcHMpCmludGVydmFscyA8LSBtYXRyaXgoTkEsIG5yb3cgPSBuX3JlcHMsIG5jb2wgPSAyKQpmb3IgKGkgaW4gMTpuX3JlcHMpIHsKICBkYXRhIDwtIHJub3JtKG4sIHRydWVfbWVhbiwgMSkKICBpbnRlcnZhbCA8LSBjb25maWRlbmNlX2ludGVydmFsKGRhdGEpCiAgaW50ZXJ2YWxzW2ksXSA8LSBpbnRlcnZhbAogIGNvbnRhaW5zW2ldIDwtIGludGVydmFsWzFdIDwgdHJ1ZV9tZWFuICYmIHRydWVfbWVhbiA8IGludGVydmFsWzJdCn0KCmNvdmVyYWdlIDwtIG1lYW4oY29udGFpbnMpCmNvdmVyYWdlCmBgYAoKIyMjIyBSZXBsaWNhdGUKYGBge3J9CmludGVydmFscyA8LSByZXBsaWNhdGUobl9yZXBzLCB7CiAgZGF0YSA8LSBybm9ybShuLCB0cnVlX21lYW4sIDEpCiAgY29uZmlkZW5jZV9pbnRlcnZhbChkYXRhKQp9KQoKY29udGFpbnMgPC0gYXBwbHkoaW50ZXJ2YWxzLCAyLCBwcm9kKSA8IDAKCmNvdmVyYWdlIDwtIG1lYW4oY29udGFpbnMpCmNvdmVyYWdlCmBgYAoKIyMjIyBUaWR5dmVyc2UKYGBge3J9CmxpYnJhcnkodGlkeXZlcnNlKQoKaW50ZXJ2YWxzIDwtIHJlcnVuKG5fcmVwcywgY29uZmlkZW5jZV9pbnRlcnZhbChybm9ybShuLCB0cnVlX21lYW4pKSkKCmNvbnRhaW5zIDwtIG1hcF9sZ2woaW50ZXJ2YWxzLCB+cHJvZCguKSA8IDApCgpjb3ZlcmFnZSA8LSBtZWFuKGNvbnRhaW5zKQpjb3ZlcmFnZQpgYGAKCgpMb29rIGF0IHRoYXQhIENsb3NlIHRvIC45NSAtIGlsbHVzdHJhdGluZyBleGFjdGx5IHdoYXQgdGhlIGRlZmluaXRpb24gb2YgYSBjb25maWRlbmNlIGludGVydmFsIHJlZmVycyB0by4gTm93LCBzaW5jZSB3ZSBhcnJpdmVkIGF0IHRoaXMgY2FsY3VsYXRpb24gdmlhIGEgc2ltdWxhdGlvbiwgd2UgY2FuIHB1dCBhIGNvbmZpZGVuY2UgaW50ZXJ2YWwgb24gb3VyIGNvdmVyYWdlIGVzdGltYXRlIGZvciBjb25maWRlbmNlIGludGVydmFscy4gTm93LCBub3RlIHRoYXQgaW4gdGhpcyBjYXNlIHdlJ3JlIGNyZWF0aW5nIGEgY29uZmlkZW5jZSBpbnRlcnZhbCBmb3IgYSBwcm9wb3J0aW9uOgoKJCQKcCBccG0gWl9cYWxwaGEgXHNxcnR7XGZyYWN7cCgxLXApfXtOfX0KJCQKCmBgYHtyfQpjb3ZlcmFnZSArIGMoLTEsIDEpICogcW5vcm0oMC45NzUpICogc3FydChjb3ZlcmFnZSAqICgxIC0gY292ZXJhZ2UpL25fcmVwcykKYGBgCgpOb3cgbGV0J3MgZ2V0IHNvbWUgdmlzdWFsIGludHVpdGlvbiBiZWhpbmQgdGhpczoKYGBge3J9CmludGVydmFsc19kZiA8LSBtYXBfZGYoaW50ZXJ2YWxzLCB+dGliYmxlKGxvd2VyID0gLlsiTG93ZXIgQm91bmQiXSwgdXBwZXIgPSAuWyJVcHBlciBCb3VuZCJdKSkgJT4lIAogIG11dGF0ZShpbmRleCA9IDE6bigpLAogICAgICAgICBpbmNsdWRlcyA9IGxvd2VyIDw9IHRydWVfbWVhbiAmIHVwcGVyID49IHRydWVfbWVhbikKYGBgCgpgYGB7ciwgZmlnLmhlaWdodD02LCBmaWcud2lkdGg9NH0KZ2dwbG90KGhlYWQoaW50ZXJ2YWxzX2RmLCAxMDApLCBhZXMoeCA9IGxvd2VyLCB4ZW5kID0gdXBwZXIsIHkgPSBpbmRleCwgeWVuZCA9IGluZGV4KSkgKwogIGdlb21fc2VnbWVudChsaW5lZW5kID0gImJ1dHQiLCBhZXMoY29sID0gaW5jbHVkZXMpLCBzaG93LmxlZ2VuZCA9IEZBTFNFLCBsd2QgPSAyKSArCiAgZ2VvbV92bGluZSh4aW50ZXJjZXB0ID0gdHJ1ZV9tZWFuLCBjb2wgPSAicmVkIiwgbHdkID0gMS41KSArCiAgdGhlbWVfYncoKSArCiAgbGFicyh0aXRsZSA9ICJDb25maWRlbmNlIEludGVydmFsIENvdmVyYWdlIiwKICAgICAgIHggPSAiIiwgCiAgICAgICB5ID0gIiIpCmBgYAoKCgojIyMgRXhwYW5kIHRoZSBzaW11bGF0aW9uCmBgYHtyfQpzaW11bGF0ZSA8LSBmdW5jdGlvbihjaV9mdW5jdGlvbiwgdHJ1dGggPSAwLCBuX3JlcHMgPSAxMDAwMCkgewogICMgY2lfZnVuY3Rpb24gcmV0dXJucyBhIGNvbmZpZGVuY2UgaW50ZXJ2YWwgZm9yIGEgZ2l2ZW4gc2V0IG9mIGRhdGEKICBjb3YgPC0gYXBwbHkocmVwbGljYXRlKG5fcmVwcywgY2lfZnVuY3Rpb24oKSAtIHRydXRoKSwgMiwgcHJvZCkgPCAwCiAgY292ZXJhZ2UgPC0gbWVhbihjb3YpCiAgIyBDb25maWRlbmNlIGludGVydmFsIGZvciBwcm9wb3J0aW9uCiAgY292ZXJhZ2UgKyBjKC0xLCAxKSAqIHFub3JtKDAuOTc1KSAqIHNxcnQoY292ZXJhZ2UgKiAoMSAtIGNvdmVyYWdlKS9uX3JlcHMpCn0KYGBgCgpgYGB7cn0Kc2ltdWxhdGUoZnVuY3Rpb24oKSBjb25maWRlbmNlX2ludGVydmFsKHJub3JtKDUpLCBzZCA9IDEpKQpzaW11bGF0ZShmdW5jdGlvbigpIGNvbmZpZGVuY2VfaW50ZXJ2YWwocm5vcm0oNSwgc2QgPSAxKSkpCnNpbXVsYXRlKGZ1bmN0aW9uKCkgY29uZmlkZW5jZV9pbnRlcnZhbChybm9ybSg1LCBzZCA9IDEpLCBzZCA9IDEuNSkpCmBgYAoKYGBge3J9CnNhcHBseShjKDUsIDEwLCA1MCwgMTAwLCAxMDAwKSwgZnVuY3Rpb24obl9vYnNlcnZhdGlvbnMpIHsKICBzaW11bGF0ZShmdW5jdGlvbigpIGNvbmZpZGVuY2VfaW50ZXJ2YWwocm5vcm0obl9vYnNlcnZhdGlvbnMpKSkKfSkKYGBgCgo=